This report summarizes the results of the Month 7 multivariable marker Superlearner modeling analysis of vaccine recipients for the HVTN 505 HIV vaccine efficacy trial. This analysis and report will be updated once the ELISpot Any Env and ADCP Mosaic markers are available.

Table 0.1 shows the 28 learner-screen combinations fed into the Superlearner. Table 0.2 shows the variable sets used as input feature sets for the Superlearner. The first variable set, baseline risk factors, comprises the same baseline factors adjusted for in the other correlates objectives of the SAP (RSA, Age, BMI and baseline risk score). For each set of Month 7 markers, both primary and exploratory markers are included, because the objective of this machine learning analysis is to be maximally inclusive and unbiased. In addition, all individual Month 7 markers that are constituents of one or more of the 12 markers are included; for example, the antigen-specific breadth score variables aggregate readouts over a set of antigens. Thus the variable set “All BAMA IgG3 gp140 markers” in Table 0.2 includes all individual-antigen IgG3 gp140 markers as well as the IgG3 gp140 breadth score marker.

For each variable set, a point estimate and 95% confidence interval of the CV-AUC from the Superlearner model fit are used to summarize classification accuracy (Table 0.3 and Figure 0.1).
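
As a rough illustration of the quantity being summarized, the sketch below computes a weighted AUC within each validation fold and averages the fold values. It is a minimal Python sketch under stated assumptions, not the analysis code behind this report: the function names `weighted_auc` and `cv_auc`, the toy folds, the observation weights, and the use of a plain average over folds are all hypothetical, and the confidence interval calculation is not reproduced here.

```python
from typing import Sequence


def weighted_auc(y: Sequence[int], score: Sequence[float], w: Sequence[float]) -> float:
    """Weighted probability that a case outscores a non-case (ties count 1/2)."""
    num = 0.0
    den = 0.0
    for yi, si, wi in zip(y, score, w):
        if yi != 1:
            continue
        for yj, sj, wj in zip(y, score, w):
            if yj != 0:
                continue
            pair_w = wi * wj  # weight of this case/non-case pair
            den += pair_w
            if si > sj:
                num += pair_w
            elif si == sj:
                num += 0.5 * pair_w
    if den == 0.0:
        raise ValueError("need at least one case and one non-case with positive weight")
    return num / den


def cv_auc(folds) -> float:
    """Average the weighted AUC over validation folds; each fold is (y, score, w)."""
    aucs = [weighted_auc(y, s, w) for y, s, w in folds]
    return sum(aucs) / len(aucs)


# Hypothetical toy folds: (labels, predicted risks, observation weights).
toy_folds = [
    ([1, 0, 0, 1], [0.9, 0.2, 0.4, 0.3], [1.0, 2.0, 2.0, 1.0]),
    ([0, 1, 0, 0], [0.5, 0.5, 0.1, 0.7], [2.0, 1.0, 2.0, 2.0]),
]
print(cv_auc(toy_folds))  # 0.625 (fold AUCs are 0.75 and 0.5)
```

A value near 0.5 corresponds to chance-level discrimination between cases and non-cases, which is the scale on which the estimates in Table 0.3 should be read.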

The Appendix shows the results (forest plots, ROC curves and predicted probability plots) for each of the 15 variable sets, in decreasing order of CV-AUC.

Table 0.1: All learner-screen combinations (28 in total) used as input to the Superlearner.
Learner             Screen
SL.mean             all
SL.bayesglm         all
SL.bayesglm         glmnet
SL.bayesglm         univar_logistic_pval
SL.bayesglm         highcor_random
SL.gam              glmnet
SL.gam              univar_logistic_pval
SL.gam              highcor_random
SL.glm              all
SL.glm              glmnet
SL.glm              univar_logistic_pval
SL.glm              highcor_random
SL.glm.interaction  all
SL.glm.interaction  glmnet
SL.glm.interaction  univar_logistic_pval
SL.glm.interaction  highcor_random
SL.glmnet.1         all
SL.ksvm.polydot     glmnet
SL.ksvm.polydot     univar_logistic_pval
SL.ksvm.polydot     highcor_random
SL.ksvm.rbfdot      glmnet
SL.ksvm.rbfdot      univar_logistic_pval
SL.ksvm.rbfdot      highcor_random
SL.polymars         glmnet
SL.polymars         univar_logistic_pval
SL.polymars         highcor_random
SL.xgboost.4.no     all
SL.ranger.no        all
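
The pairing of learners and screens in Table 0.1 can be written down compactly as a mapping from each learner to the screens it is combined with. The following is a minimal Python sketch for bookkeeping only; the names `THREE_SCREENS` and `LIBRARY` are hypothetical, and the snippet does not reproduce how the library was passed to the Superlearner in the actual analysis.

```python
# Screens paired with each learner, transcribed from Table 0.1.
THREE_SCREENS = ["glmnet", "univar_logistic_pval", "highcor_random"]

LIBRARY = {
    "SL.mean": ["all"],
    "SL.bayesglm": ["all"] + THREE_SCREENS,
    "SL.gam": THREE_SCREENS,
    "SL.glm": ["all"] + THREE_SCREENS,
    "SL.glm.interaction": ["all"] + THREE_SCREENS,
    "SL.glmnet.1": ["all"],
    "SL.ksvm.polydot": THREE_SCREENS,
    "SL.ksvm.rbfdot": THREE_SCREENS,
    "SL.polymars": THREE_SCREENS,
    "SL.xgboost.4.no": ["all"],
    "SL.ranger.no": ["all"],
}

# Expand the mapping into the flat list of (learner, screen) pairs.
combos = [(learner, screen) for learner, screens in LIBRARY.items() for screen in screens]
print(len(combos))  # 28
```

Writing the library this way makes it easy to confirm the count of 28 and to see which learners are run only on the full variable set (`all`) and which are run only after screening.
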
Table 0.2: All variable sets (15 in total) with immunological markers for which Superlearner was run.
Variable Set Name      Variables included in the set
1_baselineRiskFactors  Baseline risk factors only (Reference model)
2_M7_ELISA             Baseline risk factors + M7 ELISA
4_M7_ADCP              Baseline risk factors + M7 ADCP
5_M7_IgG3              Baseline risk factors + M7 IgG3
6_M7_IgG3gp140         Baseline risk factors + M7 IgG3 gp140
7_M7_IgG3gp120         Baseline risk factors + M7 IgG3 gp120
8_M7_IgG3V1V2          Baseline risk factors + M7 IgG3 V1V2
9_M7_IgG3gp41          Baseline risk factors + M7 IgG3 gp41
10_M7_IgG3bScores      Baseline risk factors + M7 IgG3 Breadth Scores
11_M7_IgG3multi        Baseline risk factors + M7 IgG3 Multi-Epitope breadth
12_M7_IgG3overall      Baseline risk factors + M7 Overall score across assays
14_2+4                 Baseline risk factors + M7 ELISA + M7 ADCP
15_2+5                 Baseline risk factors + M7 ELISA + M7 IgG3
18_4+5                 Baseline risk factors + M7 ADCP + M7 IgG3
22_2+4+5               Baseline risk factors + M7 ELISA + M7 ADCP + M7 IgG3
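
The names of the combined variable sets in Table 0.2 appear to encode which single-assay sets they pool: for example, `14_2+4` combines set 2 (M7 ELISA) and set 4 (M7 ADCP) on top of the baseline risk factors. The short Python sketch below decodes the names on that assumption; the helper `describe` and the lookup `SINGLE` are hypothetical and are included only to make the naming convention explicit.

```python
BASE = "Baseline risk factors"

# Single-assay marker groups referenced by the combined sets (from Table 0.2).
SINGLE = {2: "M7 ELISA", 4: "M7 ADCP", 5: "M7 IgG3"}


def describe(name: str) -> str:
    """Expand a combined set name such as '14_2+4' into its components."""
    _, _, members = name.partition("_")  # drop the leading set index
    parts = [SINGLE[int(k)] for k in members.split("+")]
    return " + ".join([BASE] + parts)


for name in ["14_2+4", "15_2+5", "18_4+5", "22_2+4+5"]:
    print(name, "->", describe(name))
```

Each decoded description matches the corresponding row of Table 0.2, which supports reading the suffix as a list of pooled set indices.
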
Table 0.3: Superlearner performance across all 15 variable sets sorted by weighted CV-AUC performance.
Variable set       CV-AUC (95% CI)
1_baselineFactors  0.584 [0.483, 0.685]
10_M7_IgG3bScores  0.524 [0.419, 0.629]
4_M7_ADCP          0.523 [0.419, 0.627]
6_M7_IgG3gp140     0.521 [0.421, 0.622]
5_M7_IgG3          0.518 [0.416, 0.620]
12_M7_IgG3overall  0.516 [0.412, 0.620]
9_M7_IgG3gp41      0.510 [0.404, 0.615]
8_M7_IgG3V1V2      0.508 [0.404, 0.612]
18_4+5             0.505 [0.404, 0.606]
2_M7_ELISA         0.505 [0.400, 0.609]
22_2+4+5           0.501 [0.399, 0.603]
14_2+4             0.498 [0.391, 0.604]
11_M7_IgG3multi    0.495 [0.392, 0.598]
15_2+5             0.495 [0.393, 0.597]
7_M7_IgG3gp120     0.491 [0.386, 0.595]
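
As a reading aid for Table 0.3, the Python sketch below transcribes the table and runs three simple checks: that the rows are in decreasing order of the point estimate, which intervals lie entirely above 0.5, and which marker sets have a point estimate above the baseline-only reference model. The variable names are hypothetical, and the checks are plain arithmetic on the reported numbers, not a formal comparison between models.

```python
# (variable set, CV-AUC, CI lower, CI upper), transcribed from Table 0.3.
ROWS = [
    ("1_baselineFactors", 0.584, 0.483, 0.685),
    ("10_M7_IgG3bScores", 0.524, 0.419, 0.629),
    ("4_M7_ADCP", 0.523, 0.419, 0.627),
    ("6_M7_IgG3gp140", 0.521, 0.421, 0.622),
    ("5_M7_IgG3", 0.518, 0.416, 0.620),
    ("12_M7_IgG3overall", 0.516, 0.412, 0.620),
    ("9_M7_IgG3gp41", 0.510, 0.404, 0.615),
    ("8_M7_IgG3V1V2", 0.508, 0.404, 0.612),
    ("18_4+5", 0.505, 0.404, 0.606),
    ("2_M7_ELISA", 0.505, 0.400, 0.609),
    ("22_2+4+5", 0.501, 0.399, 0.603),
    ("14_2+4", 0.498, 0.391, 0.604),
    ("11_M7_IgG3multi", 0.495, 0.392, 0.598),
    ("15_2+5", 0.495, 0.393, 0.597),
    ("7_M7_IgG3gp120", 0.491, 0.386, 0.595),
]

REFERENCE = "1_baselineFactors"
ref_auc = next(auc for name, auc, _, _ in ROWS if name == REFERENCE)

# Rows should already be in decreasing order of the point estimate.
is_sorted = all(a[1] >= b[1] for a, b in zip(ROWS, ROWS[1:]))

# Sets whose interval lies entirely above 0.5 (chance-level discrimination).
above_chance = [name for name, _, lo, _ in ROWS if lo > 0.5]

# Marker sets whose point estimate exceeds the baseline-only reference model.
beats_reference = [name for name, auc, _, _ in ROWS if name != REFERENCE and auc > ref_auc]

print(is_sorted, above_chance, beats_reference)  # True [] []
```

With the values as reported, both lists are empty: every interval includes 0.5, and no marker set has a point estimate above that of the baseline risk factors alone.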

Figure 0.1: Forest plot showing Superlearner performance (weighted CV-AUC with 95% CI) across all 15 variable sets.

1 Appendix

Forest plots, ROC curves and predicted probability plots are shown for each variable set.

Figure 1.1: Variable set “1_baselineFactors”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.2: Variable set “1_baselineFactors”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.3: Variable set “1_baselineFactors”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.4: Variable set “10_M7_IgG3bScores”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.5: Variable set “10_M7_IgG3bScores”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.6: Variable set “10_M7_IgG3bScores”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.7: Variable set “4_M7_ADCP”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.8: Variable set “4_M7_ADCP”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.9: Variable set “4_M7_ADCP”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.10: Variable set “6_M7_IgG3gp140”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.11: Variable set “6_M7_IgG3gp140”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.12: Variable set “6_M7_IgG3gp140”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.13: Variable set “5_M7_IgG3”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.14: Variable set “5_M7_IgG3”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.15: Variable set “5_M7_IgG3”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.16: Variable set “12_M7_IgG3overall”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.17: Variable set “12_M7_IgG3overall”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.18: Variable set “12_M7_IgG3overall”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.19: Variable set “9_M7_IgG3gp41”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.20: Variable set “9_M7_IgG3gp41”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.21: Variable set “9_M7_IgG3gp41”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.22: Variable set “8_M7_IgG3V1V2”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.23: Variable set “8_M7_IgG3V1V2”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.24: Variable set “8_M7_IgG3V1V2”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.25: Variable set “18_4+5”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.26: Variable set “18_4+5”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.27: Variable set “18_4+5”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.28: Variable set “2_M7_ELISA”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.29: Variable set “2_M7_ELISA”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.30: Variable set “2_M7_ELISA”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.31: Variable set “22_2+4+5”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.32: Variable set “22_2+4+5”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.33: Variable set “22_2+4+5”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.34: Variable set “14_2+4”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.35: Variable set “14_2+4”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.36: Variable set “14_2+4”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.37: Variable set “11_M7_IgG3multi”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.38: Variable set “11_M7_IgG3multi”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.39: Variable set “11_M7_IgG3multi”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.40: Variable set “15_2+5”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.41: Variable set “15_2+5”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.42: Variable set “15_2+5”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.

Figure 1.43: Variable set “7_M7_IgG3gp120”: Weighted CV-AUC (95% CI) of algorithms for predicting HIV disease status after Day 210.

Figure 1.44: Variable set “7_M7_IgG3gp120”: Weighted CV-AUC ROC curves of top two individual learners along with Superlearner and discrete-SL.

Figure 1.45: Variable set “7_M7_IgG3gp120”: Weighted prediction probability plots of top two individual learners along with Superlearner and discrete-SL.